41 Requirement-based data cube schema design

2) David W. Cheung, Bo Zhou, Ben Kao, Hongjun Lu, Tak Wah Lam, Hing Fung Ting
Proceedings of the eighth international conference on Information and knowledge 
management November 1999 

On-line analytical processing (OLAP) requires efficient processing of complex decision support 
queries over very large databases. It is well accepted that pre-computed data cubes can help 
reduce the response time of such queries dramatically. A very important design issue of an 
efficient OLAP system is therefore the choice of the right data cubes to materialize. We call this 
problem the data cube schema design problem. In this paper we show that the problem of 
finding an op ...
conference on Management of data June 1998

Volume 27 Issue 2 

Database researchers have made significant progress on several research issues related to 
multidimensional data analysis, including the development of fast cubing algorithms, efficient 
schemes for creating and maintaining precomputed group-bys, and the design of efficient 
storage structures for multidimensional data. However, to date there has been little or no work 
on multidimensional query optimization. Recently, Microsoft has proposed “OLE DB
ACM SIGMOD Record September 1997
Volume 26 Issue 3 

A data cube is a popular organization for summary data. A cube is simply a multidimensional 
structure that contains at each point an aggregate value, i.e., the result of applying an aggregate 
function to an underlying relation. In practical situations, cubes can require a large amount of 
storage. The typical approach to reducing storage cost is to materialize parts of the cube on 
demand. Unfortunately, this lazy evaluation can be a time-consuming operation. In this paper, 
we desc ...
Surajit Chaudhuri, Umeshwar Dayal

ACM SIGMOD Record, Proceedings of the 1997 ACM SIGMOD international
On-Line Analytical Processing (OLAP) and Data Warehousing are decision support
technologies. Their goal is to enable enterprises to gain competitive advantage by exploiting the 
ever-growing amount of data that is collected and stored in corporate databases and files for 
better and faster decision making. Over the past few years, these technologies have experienced 
explosive growth, both in the number of products and services offered, and in the extent of 
coverage in the tra ...
Brief Summary Text (9):

As data warehousing becomes more popular, OLAP is gaining in importance as a
primary interface for evaluating data contained in the data warehouse. Most
successful data mining applications include reporting systems that have fast query
response mechanisms. Most corporations require decision support and would benefit
from improved technology to help in making decisions based upon rapidly gathered
and organized data.
Description

On-Line Analytical Processing (OLAP) and Data Warehousing
are decision support technologies. Their goal is to enable
enterprises to gain competitive advantage by exploiting the
ever-growing amount of data that is collected and stored in
corporate databases and files for better and faster decision
making. Over the past few years, these technologies have
experienced explosive growth, both in the number of products
and services offered, and in the extent of coverage in
the trade press. Vendors, including all database companies,
are paying increasing attention to all aspects of decision
support.

Decision support places some rather different requirements
on database technology as compared to traditional on-line
transaction processing (OLTP) applications. OLTP applications
typically automate clerical data processing tasks
such as order entry and banking transactions that are the
bread-and-butter, day-to-day operations of an organization.
These tasks are structured and repetitive, and consist of
short, atomic, isolated transactions, which require detailed,
up-to-date data, and read or update a few (tens of) records.
Consistency and recoverability of the database are critical,
and maximizing transaction throughput is the key performance
metric.

Decision support, in contrast, requires historical, summarised
and consolidated data from many sources scattered
through the enterprise. Data is extracted from these sources
and loaded into a data warehouse, a large, integrated,
relatively static database that is often maintained separately
from the organisation's operational databases. To facilitate
complex analyses and visualisation, the data warehouse
typically supports a multidimensional model of data. Since
data warehouses contain consolidated data, perhaps from
several operational databases, over potentially long periods
of time, they tend to be orders of magnitude larger than
operational databases; enterprise data warehouses are projected
to be hundreds of gigabytes to terabytes in size. The
workloads are query intensive with mostly ad hoc, complex
queries that can access millions of records. Query throughput
and response times are more important than transaction
throughput.

SIGMOD '97 AZ, USA

© 1997 ACM 0-89791-911-4/97/0005...$3.50



Data warehouses might be implemented on standard or
extended relational database management systems, called
Relational OLAP (ROLAP) servers. These servers assume
that data is stored in relational databases. They use special
database designs (star and snowflake schemas) to represent
the multidimensional data model, and special access methods
and query processing techniques to map OLAP operations
efficiently onto the underlying relational database.
Alternatively, multidimensional OLAP (MOLAP) servers may be
used. These are specialised servers that directly store
multidimensional data in special data structures (e.g., arrays)
and implement the OLAP operations over these special data
structures.
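
The contrast between the two server styles can be illustrated with a minimal sketch. It is not taken from the tutorial: the toy sales data, the table and column names, and both function names are assumptions made for illustration. The first function stores the data in a small star schema (one fact table plus one table per dimension) in SQLite and answers a roll-up with a join and GROUP BY, as a ROLAP server might. The second stores the same data in a dense three-dimensional array addressed by dimension positions and sums out two axes, as a MOLAP server might.

```python
import sqlite3

# Hypothetical toy data: (product, region, quarter, amount).
SALES = [
    ("widget", "east", "Q1", 100.0),
    ("widget", "west", "Q1", 150.0),
    ("gadget", "east", "Q1", 200.0),
    ("widget", "east", "Q2", 120.0),
    ("gadget", "west", "Q2", 80.0),
]


def rolap_sales_by_region(rows):
    """ROLAP style: a star schema in a relational engine, queried with SQL."""
    con = sqlite3.connect(":memory:")
    con.executescript(
        """
        CREATE TABLE product (product_id INTEGER PRIMARY KEY, name TEXT UNIQUE);
        CREATE TABLE region  (region_id  INTEGER PRIMARY KEY, name TEXT UNIQUE);
        CREATE TABLE quarter (quarter_id INTEGER PRIMARY KEY, name TEXT UNIQUE);
        -- Fact table: one foreign key per dimension plus the measure.
        CREATE TABLE sales_fact (
            product_id INTEGER REFERENCES product(product_id),
            region_id  INTEGER REFERENCES region(region_id),
            quarter_id INTEGER REFERENCES quarter(quarter_id),
            amount     REAL
        );
        """
    )
    for product, region, quarter, amount in rows:
        keys = []
        for table, value in (("product", product), ("region", region),
                             ("quarter", quarter)):
            # Insert the dimension member once, then look up its surrogate key.
            con.execute(f"INSERT OR IGNORE INTO {table} (name) VALUES (?)", (value,))
            row = con.execute(
                f"SELECT {table}_id FROM {table} WHERE name = ?", (value,)
            ).fetchone()
            keys.append(row[0])
        con.execute("INSERT INTO sales_fact VALUES (?, ?, ?, ?)", (*keys, amount))
    # Roll-up over product and quarter, expressed as a join plus GROUP BY.
    result = con.execute(
        """
        SELECT r.name, SUM(f.amount)
        FROM sales_fact AS f
        JOIN region AS r ON r.region_id = f.region_id
        GROUP BY r.name
        """
    ).fetchall()
    con.close()
    return dict(result)


def molap_sales_by_region(rows):
    """MOLAP style: a dense array addressed by dimension positions."""
    products = sorted({r[0] for r in rows})
    regions = sorted({r[1] for r in rows})
    quarters = sorted({r[2] for r in rows})
    p_idx = {name: i for i, name in enumerate(products)}
    r_idx = {name: i for i, name in enumerate(regions)}
    q_idx = {name: i for i, name in enumerate(quarters)}
    # cube[product][region][quarter] holds the aggregated amount for that cell.
    cube = [[[0.0 for _ in quarters] for _ in regions] for _ in products]
    for product, region, quarter, amount in rows:
        cube[p_idx[product]][r_idx[region]][q_idx[quarter]] += amount
    # Roll-up: sum out the product and quarter axes for each region.
    return {
        region: sum(
            cube[p][r_idx[region]][q]
            for p in range(len(products))
            for q in range(len(quarters))
        )
        for region in regions
    }
```

Both functions return the same per-region totals for the toy data. The array form reaches a cell by position but reserves space for every combination of dimension values, including empty ones, whereas the relational form stores only the facts that exist and pays for joins at query time.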

This tutorial provides a roadmap of data warehousing
and OLAP technologies, with an emphasis on their new
requirements. We describe back end tools for extracting,
cleaning and loading data into a data warehouse;
multidimensional data models and OLAP operations; front end
client tools for querying and data analysis; server extensions
for efficient query processing; and tools for metadata
management and for managing the warehouse. We survey the
state of the art and mention representative products. In a
recent overview paper, we have summarized the issues that
are discussed in this tutorial [1].

The area opens up interesting research directions, with
ties to past work in database systems, but with different
assumptions and requirements. Only very recently, however,
has the database research community started to address
some of these issues. Research in data warehousing so far
has focused primarily on query processing and view maintenance
issues. There still are many open research problems.
We describe some of these briefly.

Outline 

1. Introduction 

• definitions, evolution, differences from OLTP, architectures

2. Models and Tools

• conceptual model for OLAP

• front-end tools (e.g., multidimensional spreadsheets)

• database design (e.g., star and snowflake schema)

3. Database Server technologies for Decision Support Queries 

• specialized indexing and query processing techniques

• intelligent processing of aggregates

• complex query processing 

• extensions to SQL 

• ROLAP vs MOLAP 

4. Other Services for OLAP/Data warehousing 

• data cleaning, loading and refresh 

• tools for warehouse, system and process management

• metadata management and the role of repository 

5. State of Commercial Practice 

6. Research Issues